PATENT

Attorney Docket No.: 021202-003910US 
Client Reference No.: QST-041 CV US 

RECONFIGURABLE BIT-MANIPULATION NODE 

CROSS-REFERENCES TO RELATED APPLICATIONS 
[0001] The present application claims the benefit of priority under 35 U.S.C. § 119
from U.S. Provisional Patent Application Serial No. 60/418,019, entitled
"RECONFIGURABLE BIT-MANIPULATION NODE", filed on October 11, 2002, the
disclosure of which is hereby incorporated by reference in its entirety for all purposes.

BACKGROUND OF THE INVENTION 
[0002] The present invention generally relates to a device for providing bit 

manipulation and, more specifically, to a reconfigurable bit-manipulation node. 

[0003] There are two basic varieties of bit manipulation. The first type is single bit.
In single-bit manipulation, each bit represents a "hard decision" or, in other words, a "1" or "0".
These individual hard decision bits are often found in the transmit portions of communications
systems, among many others. The second type is multi-bit or "soft decision". Soft decision
bits come in many bit widths. Soft decision is common in the receive portions of
communications systems, where the sampled bit is not known to be a "1" or "0" until
processing has been completed.

[0004] It would be desirable to have a reconfigurable or programmable bit
manipulation node that is capable of providing high performance processing for hard and soft
decision data as well as the ability to implement different processing functions on bits when
desired.

BRIEF SUMMARY OF THE INVENTION 
[0005] A reconfigurable bit-manipulation node is disclosed. The node includes an
execution unit configured to perform a number of bit-oriented functions and a control unit
configured to control the execution unit to allow one of the bit-oriented functions to be
performed. The bit-oriented functions include, for example, Viterbi decoding, turbo
decoding, variable length encoding and decoding, scrambling, cyclical redundancy check and
convolutional encoding.

[0006] The execution unit includes a number of elements interconnected with one
another to allow the bit-oriented functions to be performed. The elements include a
programmable butterfly unit, a number of non-programmable butterfly units, a number of
data path elements, a look-up-table memory and a reorder memory. The execution unit is
capable of engaging in one of a number of operating modes to perform the bit-oriented
functions. The operating modes include a programmable mode and a number of fixed
operating modes.

[0007] The fixed operating modes include a Viterbi mode, a soft-in-soft-out mode
(turbo decoder), a variable length encoding mode and a variable length decoding mode. When
engaged in the programmable mode, the execution unit does not utilize any of the
non-programmable butterfly units. When engaged in the Viterbi mode, the execution unit utilizes
both the programmable butterfly unit and the non-programmable butterfly units and uses the
look-up-table memory as a path metric memory and the reorder memory as a trace back
memory. When engaged in the soft-in-soft-out mode, the execution unit utilizes both the
programmable butterfly unit and three of a number of non-programmable butterfly units.
Finally, when engaged in the variable length encoding mode or the variable length decoding
mode, the execution unit only uses a subset of operations available from the programmable
butterfly unit.

[0008] The data path elements include a programmable shifter and a programmable
combiner. The programmable shifter is programmable on a cycle-by-cycle basis and
configured to perform an exclusive-or function on multiple shifted versions of its inputs. The
programmable shifter is further programmable to implement a parallel linear feedback shift
register which may be maskable. The programmable combiner is configured to perform
packing on an input having variable input lengths to generate an output word having variable
output lengths. The programmable combiner is further configured to perform bit interlacing
and bit puncturing. Packing, bit interlacing and bit puncturing can be performed
concurrently.

[0009] The bit-oriented functions are used to handle a number of channel coding
schemes including error detecting cyclic codes, error detecting and correcting Hamming
codes and single burst error correcting Fire codes.

[0010] Reference to the remaining portions of the specification, including the
drawings and claims, will realize other features and advantages of the present invention.
Further features and advantages of the present invention, as well as the structure and
operation of various embodiments of the present invention, are described in detail below with
respect to the accompanying drawings, in which like reference numbers indicate identical or
functionally similar elements.

BRIEF DESCRIPTION OF THE DRAWINGS 
[0011] FIG. 1A is a simplified block diagram illustrating one exemplary embodiment
of a reconfigurable bit-manipulation node in accordance with the present invention;

[0012] FIG. 1B is a simplified block diagram illustrating another exemplary
embodiment of the reconfigurable bit-manipulation node in accordance with the present
invention;

[0013] FIG. 2 is a simplified block diagram illustrating an exemplary embodiment of
an execution unit in accordance with the present invention;

[0014] FIG. 3 is a simplified block diagram illustrating an exemplary embodiment of
an unpacker in accordance with the present invention;

[0015] FIG. 4 is a simplified block diagram illustrating an exemplary embodiment of
a register file in accordance with the present invention;

[0016] FIG. 5 is a simplified block diagram illustrating an exemplary embodiment of
a combiner in accordance with the present invention;

[0017] FIG. 6A is a simplified block diagram illustrating a data path of an exemplary
embodiment of a programmable butterfly in accordance with the present invention;

[0018] FIG. 6B is a simplified block diagram illustrating an exemplary embodiment
of a non-programmable butterfly in accordance with the present invention;

[0019] FIG. 7A is a simplified block diagram illustrating a MAX STAR operation;

[0020] FIG. 7B is a simplified block diagram illustrating a MAX STAR-STAR
operation;

[0021] FIG. 8A is a simplified block diagram illustrating an exemplary embodiment
of a control unit in accordance with the present invention;

[0022] FIG. 8B is a simplified block diagram illustrating control of state bits
according to one exemplary embodiment of the present invention;

[0023] [0024] FIG. 9 is a simplified block diagram of an exemplary embodiment of a
programmable pattern generator in accordance with the present invention;

[0025] FIG. 10A is a simplified block diagram illustrating how state table(s) is
accessed according to one exemplary embodiment of the present invention;

[0026] FIG. 10B is a simplified block diagram illustrating how state bit table counters
are used to access state table(s) according to one exemplary embodiment of the present
invention;

[0027] FIG. 11 is a simplified block diagram illustrating a data path of an exemplary
embodiment of the control unit in accordance with the present invention;

[0028] FIG. 12 is a simplified block diagram illustrating how fixed pattern control is
provided according to one exemplary embodiment of the present invention;

[0029] FIG. 13 is a simplified schematic diagram illustrating a linear feedback shift
register for the generator polynomial used for the GSM (224, 184) Fire code according to one
exemplary embodiment of the present invention;

[0030] FIG. 14 is a simplified block diagram of an encoder;

[0031] FIG. 15 is a simplified block diagram showing an exemplary parallel hardware
implementation of a shifter in accordance with one exemplary embodiment of the present
invention;

[0032] FIG. 16 is a simplified block diagram illustrating an exemplary embodiment of
a shifter in accordance with the present invention;

[0033] FIG. 17 is a simplified block diagram illustrating an expander in accordance
with one exemplary embodiment of the present invention; and

[0034] FIG. 18 is a simplified block diagram illustrating an exemplary embodiment of
a maskable LFSR in accordance with the present invention.

DETAILED DESCRIPTION OF THE INVENTION 
[0035] The present invention in the form of one or more exemplary embodiments will
now be described. The purpose of a RBN (Reconfigurable Bit-manipulation Node) is to
provide ASIC (Application Specific Integrated Circuit) comparable performance for
bit-focused operations while maintaining a reasonable level of programmability or
reconfigurability. The reconfigurability can be on an algorithm, task, sub-task, or even a bit
basis. Since many bit-oriented functions require significant processing on a DSP (Digital
Signal Processor) or microprocessor, the addition of the RBN to an ACM (Adaptive
Computing Machine) is beneficial. In an exemplary embodiment, some of the bit-oriented
functions performed by the RBN include: Viterbi decoding, turbo decoding, VL (Variable
Length) encoding and decoding. In addition, the RBN supports many other functions such as
scrambling, CRC (Cyclical Redundancy Check) and convolutional encoding. These various
functions performed by the RBN will be further described below.

Overview of the RBN 

[0036] FIG. 1A is a simplified block diagram illustrating one exemplary embodiment
of the RBN 10 in accordance with the present invention. In one exemplary embodiment as
shown in FIG. 1A, the RBN 10 is separated into two main sections, namely, an EU
(Execution Unit) 12 and an EU control unit 14. The EU 12 further includes butterfly units
and data path elements that perform processing functions and provide storage or
interconnections. The EU control unit 14 includes elements that provide for sequencing,
function selection, and interconnect selection in support of the EU 12. The EU control unit
14 also implements the control connections to a node wrapper.

[0037] In one exemplary embodiment, the EU 12 is made up of five major blocks.
These five major blocks include: (1) a programmable butterfly unit 16; (2) a number of
butterfly units #2-4 18; (3) a LUT (Look-Up Table) RAM (Random Access Memory) 20; (4)
a reorder RAM 22; and (5) a number of data path elements or operators 24.

[0038] In one exemplary aspect, the RBN 10 is capable of engaging in a number of
operating modes including one (1) Programmable Mode and four (4) fixed operating modes.
The four fixed operating modes are: (1) Viterbi Mode; (2) SISO (Soft In Soft Out) Mode; (3)
VL Encoding Mode; and (4) VL Decoding Mode. As shown in FIG. 1A, all the modes
receive input data from nodal memory ports, labeled X and Y, and the outputs are sent to the
node wrapper via a node output 26. FIG. 1B is an illustrative diagram showing an alternative
exemplary embodiment of the RBN 10 in accordance with the present invention.

Data Path Description 

[0039] FIG. 2 is a simplified block diagram illustrating an exemplary embodiment of
the EU 12 in accordance with the present invention. The data path for the EU 12 includes
data path or functional elements, interconnection elements and storage elements. The data
path is 16 bits wide and the data path elements operate on 16-bit or 8-bit data. Where 8-bit
data is used, the data is taken from the lowest byte of a 16-bit word. Wherever possible, the
data path elements are designed to achieve multiple bit operations per clock cycle. For
example, rather than using 16 single-bit XORs, a 16-bit XOR is implemented.

[0040] The interconnect within the EU 12 is implemented using multiplexers. In one
exemplary implementation, there are four types of multiplexers: (1) 8-bit four-to-one (4:1)
multiplexers; (2) 16-bit four-to-one multiplexers; (3) 16-bit 16-to-one (16:1) full multiplexers
which are fixed during processing of an entire task; and (4) 16-bit 16-to-one (16:1) full
multiplexers that can change with each clock cycle. The four-to-one multiplexers are used
for functions that typically get their inputs from the same source(s). The full multiplexers
allow any of the primary data path element outputs to be used as an input.

[0041] In one exemplary embodiment, the storage elements in the EU 12 include the
reorder RAM 22 and the LUT RAM 20. The LUT RAM 20 is usually used for accessing
table-type data that is typically fixed for an entire task. The reorder RAM 22 is for data that
is input to the RBN 10 or created by the RBN 10 during the task. The data in the reorder
RAM 22 is usually either used later by the RBN 10 or output by the RBN 10 during the
current task. Each of the data path elements or operators 24 in the EU 12 will be further
described below.

Data Path Elements/Operators 

Unpacker 

[0042] The unpacker provides the ability to unpack 32-bit words into 16-bit, 8-bit or
4-bit words. The data path operates on 16-bit and 8-bit words and the unpacker allows the
memory to be used efficiently. The unpacker includes an ALU, which is used in Viterbi Mode
for the branch metric calculation and, in the Programmable Mode, for some other calculations
on the inputs. The unpacker provides for some basic depuncturing, registering, and
sign extension as well. An exemplary embodiment of the unpacker is shown in FIG. 3.

Register File

[0043] The register file is thirty-two words deep and sixteen bits wide (32 x 16). The
register file is used to provide storage of intermediate data which will be needed at a later
time. The register file can be used as a FIFO (First-In-First-Out). When used as a FIFO, the
register file provides the ability to equalize pipeline delays in the RBN data path. In most
applications, when used as a FIFO, the register file is given a fixed delay in number of clock
cycles, and any data word written into it will be read out that selected number of
clock cycles later. In Viterbi mode, the register file has additional features as part of a trace
back circuitry. An exemplary embodiment of the register file is shown in FIG. 4.
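
As an illustration only, the fixed-delay FIFO behavior described above can be modeled in software as follows. This is a minimal sketch, not the register file circuit of FIG. 4: the class name DelayFifo, the zero pre-fill, and the read-before-write ordering within a cycle are assumptions made for the example.

```python
from collections import deque


class DelayFifo:
    """Fixed-delay FIFO: a word written now is read back `delay` cycles later."""

    def __init__(self, delay, depth=32, width=16):
        # depth and width follow the 32 x 16 register file in the text.
        if not 1 <= delay <= depth:
            raise ValueError("delay must fit in the register file depth")
        self.mask = (1 << width) - 1
        # Assumption: pre-fill with zeros so early reads return a defined value.
        self.slots = deque([0] * delay, maxlen=delay)

    def clock(self, word):
        """One clock cycle: read the oldest word, then write the new one."""
        oldest = self.slots[0]
        self.slots.append(word & self.mask)  # maxlen drops the oldest entry
        return oldest


fifo = DelayFifo(delay=3)
outputs = [fifo.clock(w) for w in [10, 11, 12, 13, 14]]
```

With a delay of three, the word written in the first cycle appears at the output in the fourth cycle, which is how a parallel branch of the data path could be held back to line up with a slower one.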

Shifter 

[0044] The shifter XORs up to eight (8) shifted versions of an 8-bit input word in a
single cycle. The shifter is used to implement functions such as LFSRs (Linear Feedback
Shift Registers), convolutional encoders, scramblers and Galois multiplication. An
exemplary embodiment of the shifter will be further described below.

[0045] The shifter data path input is an 8-bit word. The control is a 15-bit control
word. The shifter output is an unregistered 8-bit word which in turn is an input to an
expander. The expander combines the 8-bit shifter outputs into 16-bit words. The expander
also provides masking and XOR accumulation. The output from the expander is a registered
16-bit word.
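
The shifter and expander behavior described above can be sketched in software as follows. This is a hedged illustration only: the function names, the use of signed shift amounts in place of the 15-bit control word (whose format is not given here), and the byte order chosen in the expander are all assumptions.

```python
def shifter(word, shifts):
    """XOR together shifted copies of an 8-bit word.

    `shifts` lists signed shift amounts (positive = left, negative = right).
    This encoding is an assumption; it is not the 15-bit control word format.
    """
    if len(shifts) > 8:
        raise ValueError("at most eight shifted versions per cycle")
    out = 0
    for s in shifts:
        shifted = word << s if s >= 0 else word >> -s
        out ^= shifted & 0xFF  # bits shifted past the byte boundary are dropped
    return out


def expander(low_byte, high_byte):
    """Combine two 8-bit shifter outputs into one 16-bit word (assumed order)."""
    return ((high_byte & 0xFF) << 8) | (low_byte & 0xFF)


# Multiplying x^2 + 1 (0b101) by x^2 + x + 1 over GF(2) is the XOR of the
# input shifted by 0, 1 and 2, which gives x^4 + x^3 + x + 1 (0b11011).
product = shifter(0b101, [0, 1, 2])
```

The example shows why an XOR of shifted copies is enough for Galois multiplication and convolutional encoding: each selected shift corresponds to one nonzero coefficient of the multiplier polynomial.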

Combiner 

[0046] The combiner packs bits, bytes and words into 32-bit words for efficient
output to the node wrapper. The combiner accepts one or two 16-bit words. Selections are
made which indicate how many bits on the input word or words are to be part of the output.
Selections are also made to specify how the input bits will be packed into the output word.
The combiner has the capability to perform bit interlacing and bit puncturing. An exemplary
embodiment of the combiner is shown in FIG. 5.
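
Packing and puncturing of the kind described above can be sketched as follows. This is an illustrative model under stated assumptions, not the combiner of FIG. 5: the function names, the MSB-first packing order, and the zero padding of a final partial word are choices made for the example.

```python
def pack_fields(fields, word_bits=32):
    """Pack (value, nbits) fields MSB-first into `word_bits`-wide words."""
    words, acc, filled = [], 0, 0
    for value, nbits in fields:
        acc = (acc << nbits) | (value & ((1 << nbits) - 1))
        filled += nbits
        while filled >= word_bits:
            filled -= word_bits
            words.append((acc >> filled) & ((1 << word_bits) - 1))
            acc &= (1 << filled) - 1  # keep only the bits not yet emitted
    if filled:  # flush a final partial word, left-justified with zero padding
        words.append(acc << (word_bits - filled))
    return words


def puncture(bits, keep_pattern):
    """Drop every bit whose position in the repeating pattern holds a 0."""
    return [b for i, b in enumerate(bits) if keep_pattern[i % len(keep_pattern)]]


packed = pack_fields([(0xABCD, 16), (0x12, 8), (0x3, 4), (0x4, 4)])
kept = puncture([1, 0, 1, 1, 0, 0], [1, 1, 0])
```

Here one 16-bit word, one byte and two nibbles fill exactly one 32-bit output word, and the puncture pattern removes every third bit.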

Programmable Butterfly Unit #1

[0047] FIG. 6A is a simplified block diagram illustrating the data path of an
exemplary embodiment of the programmable butterfly unit 16 in accordance with the present
invention. The programmable butterfly unit 16 is used in the Viterbi Mode. The SISO Mode
also makes use of the add-compare-select logic that can be performed by the programmable
butterfly unit 16. The butterfly operation implements four adds and two compare select
operations. The inputs to the two compare select operations are the outputs of the four
adders. The adders also provide subtraction capability. FIG. 6B is a simplified block
diagram illustrating an exemplary embodiment of the non-programmable butterfly 18 in
accordance with the present invention.
[0048] In Viterbi Mode, the inputs to the adders are the branch metrics and path
metrics and the selector selects the larger values. In contrast, in SISO Mode, the selector may
select the smaller values. In Programmable Mode, the programmable butterfly unit 16
functions are usable for other applications as needed. The programmable butterfly unit 16
further includes a number of elements which will be described below.
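
The butterfly operation described above (four adds feeding two compare-select operations) can be sketched as follows. This is a minimal model, not the data path of FIG. 6A: the function name, the pairing of path metrics with branch metrics, and the decision-bit convention are assumptions for illustration.

```python
def butterfly(pm0, pm1, bm0, bm1, select=max):
    """One add-compare-select butterfly: four adds feeding two selects.

    pm0, pm1 are path metrics and bm0, bm1 are branch metrics. The way they
    are paired below is an assumed, commonly used arrangement.
    """
    a = pm0 + bm0
    b = pm1 + bm1
    c = pm0 + bm1
    d = pm1 + bm0
    new0, new1 = select(a, b), select(c, d)
    # Decision bit: 0 if the first candidate survived, 1 otherwise.
    dec0 = int(new0 != a)
    dec1 = int(new1 != c)
    return (new0, dec0), (new1, dec1)


larger = butterfly(10, 7, 3, -3)               # selector picks larger values
smaller = butterfly(10, 7, 3, -3, select=min)  # selector picks smaller values
```

Passing a negative branch metric stands in for the subtraction capability of the adders. The decision bits are the kind of information a trace back memory would store.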

ALU 

[0049] There are several functions in the RBN data path that provide ALU
(Arithmetic Logic Unit) type functions. The ALUs have two 16-bit input words A and B.
The primary ALUs implement sixteen (16) operations which include A + B, A - B, A OR B,
A AND B, and others.

[0050] In the ALU, the 16-bit B word can be inverted using a toggle bit that can be 

changed on a clock cycle basis. The ALU computes a 16-bit output word that is registered. 

ADS 

[0051] The ADS (ADder Subtracter) is an ALU type function. The ADS has two
16-bit input words A and B. While the ALU implements sixteen (16) operations, the ADS
implements six (6). The six ADS operations are A + B, A - B, A, B, NOT B and ZERO.
Like the ALU, the ADS computes a 16-bit output word which is registered.

MMX 

[0052] The MMX (Minimum MaXimum) provides the compare and selection
operation. The MMX has two 16-bit input words A and B. The MMX implements one of
four possible operations. The four operations are MAX(A,B), MIN(A,B), A, and B. The
MAX function compares A and B and then outputs the larger of the two values. The MIN
function compares A and B and then outputs the smaller of the two values. The A function
outputs A. The B function outputs B. The A and B functions are useful for the cases where
passing one of the inputs to the output is preferable. The MMX computes a 16-bit output
word which is registered.
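
The six ADS operations and four MMX operations listed above can be tabulated in software as follows. This sketch assumes 16-bit two's-complement values with wrap-around on overflow and a signed comparison in the MMX; the number format is not stated here, so both are assumptions, as are the function and table names.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


ADS_OPS = {
    "A+B": lambda a, b: a + b,
    "A-B": lambda a, b: a - b,
    "A": lambda a, b: a,
    "B": lambda a, b: b,
    "NOT B": lambda a, b: ~b,
    "ZERO": lambda a, b: 0,
}

MMX_OPS = {
    "MAX": lambda a, b: max(a, b),
    "MIN": lambda a, b: min(a, b),
    "A": lambda a, b: a,
    "B": lambda a, b: b,
}


def ads(op, a, b):
    """Apply one ADS operation and wrap the result to 16 bits (assumed)."""
    return to_signed16(ADS_OPS[op](to_signed16(a), to_signed16(b)))


def mmx(op, a, b):
    """Apply one MMX operation using a signed comparison (assumed)."""
    return MMX_OPS[op](to_signed16(a), to_signed16(b))
```

For example, ads("A+B", 0x7FFF, 1) wraps to -32768 under these assumptions, and mmx("MAX", 0xFFFF, 1) returns 1 because 0xFFFF is read as -1.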

LUT RAM

[0053] The LUT RAM 20 has three primary uses. In Viterbi Mode, the LUT RAM
20 contains the path metric data. In Programmable Mode, the LUT RAM 20 is actually two 256
word by 16-bit RAMs. In Programmable Mode, the LUT RAM 20 is used as either part of
the control path or part of the data path. As part of the data path, the LUT RAM 20 is used as
a LUT (Look Up Table). When used in this manner, the LUT RAM 20 outputs a 16-bit word
which is addressed by the 8-bit input. The LUT RAM 20 is used as a LUT in the SISO, VL
Encoding and VL Decoding modes. As part of the control path, the LUT RAM 20 can be
used to output 16-bit control words for other functions. For example, when the reorder RAM
22 is used for bit or word interleaving, the address can be sourced from the LUT RAM 20.

Reorder RAM 

[0054] The reorder RAM 22 is 4K words by 16 bits. In Viterbi Mode, the reorder
RAM 22 is used to store the trace back data. In Programmable Mode, the reorder RAM 22 is
written sixteen (16) bits at a time. The reorder RAM 22 can be read either as 16-bit words,
8-bit words or a single bit at a time. The reorder RAM 22 has the capability to combine single
bits into 8-bit or 16-bit words and combine bytes into 16-bit words. If the single bits or bytes
are not combined into words, the accessed bit or byte will be found in the least significant
bit or byte of the output word. Some applications use the reorder RAM 22 for storing
intermediate data or temporary variables.

[0055] Both the read and write addresses for the reorder RAM 22 can be sourced
from the LUT RAM 20, from control counters or from some small patterns in the
control path. If a word interleaver were being implemented using the reorder RAM 22,
words would typically be written in order into the reorder RAM 22 using a control counter as
the write address. If the number of words is small (for example, 8 or fewer), then the small
pattern can be used for the read address. If the number of words is moderate (for example, 9
to 256), then the LUT RAM 20 can be used to source the read address. For larger
interleavers, the node memory is used to source the read address. The same applies to byte
and bit interleaving.
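
The word interleaver described above (write in counter order, read in table order) can be sketched as follows. This is an illustration under stated assumptions: the function names and the row-in, column-out block pattern are hypothetical, and the thresholds in read_address_source simply restate the example sizes given in the text.

```python
def interleave(words, read_addresses, ram_size=4096):
    """Write words in counter order, then read them back in table order."""
    if len(words) > ram_size:
        raise ValueError("block does not fit in the reorder RAM")
    ram = [0] * ram_size
    for write_addr, word in enumerate(words):  # control counter as write address
        ram[write_addr] = word & 0xFFFF
    return [ram[addr] for addr in read_addresses]  # pattern or LUT as read address


def block_read_order(rows, cols):
    """Read addresses for a row-in, column-out block interleaver (assumed)."""
    return [r * cols + c for c in range(cols) for r in range(rows)]


def read_address_source(num_words):
    """Pick a read-address source using the example sizes from the text."""
    if num_words <= 8:
        return "small pattern"
    if num_words <= 256:
        return "LUT RAM"
    return "node memory"


shuffled = interleave(list(range(6)), block_read_order(2, 3))
```

Six words written as two rows of three and read out by column come back in the order 0, 3, 1, 4, 2, 5.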

Mode Description

[0056] The Programmable Mode has access to all of the RBN functions with
the exception of the butterfly units #2-4 18. The Programmable Mode uses the EU control unit
14 to set and toggle control bits to the data path functions and data path connections. The
Programmable Mode can be set up to provide a wide range of bit-oriented operations.

[0057] The Viterbi Mode employs all of the RBN functions but only uses a small part
of the programmable "other" functions. The Viterbi Mode uses the LUT RAM 20 as a path
metric RAM and uses the reorder RAM 22 as a trace back RAM. In the Viterbi mode, the
RBN 10 computes four butterfly operations per clock cycle. The Viterbi Mode involves
some specialized control functionality that is not available to the other modes.

[0058] The SISO Mode is part of a turbo decoder. The SISO Mode involves an
operation called MAX STAR. FIG. 7A is an illustrative diagram showing the MAX STAR
operation. The MAX STAR operation involves two add compare select computations along
with a LUT access and an addition. Optionally, the SISO Mode also involves an operation
called MAX STAR-STAR. FIG. 7B is an illustrative diagram showing the MAX STAR-STAR
operation. The SISO Mode employs all of the RBN functions. The SISO Mode also
involves some specialized control functionality that is not available to the other modes. In
the SISO Mode, the RBN 10 computes two MAX STAR operations per clock cycle.
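
As background for the LUT access and addition mentioned above, a commonly used form of the MAX STAR operation in turbo decoding is max*(a, b) = max(a, b) + ln(1 + e^-|a-b|), with the logarithmic correction term read from a small table. The sketch below assumes that form; FIG. 7A may differ in detail, and the table size, the integer indexing, and the function name are illustrative assumptions.

```python
import math

# Assumed correction table: ln(1 + e^-d) for integer differences d = 0..7.
CORRECTION_LUT = [math.log1p(math.exp(-d)) for d in range(8)]


def max_star(a, b):
    """max*(a, b) = max(a, b) + ln(1 + e^-|a-b|), log term taken from a table."""
    index = int(abs(a - b))
    # Beyond the table the correction is close to zero and is dropped.
    correction = CORRECTION_LUT[index] if index < len(CORRECTION_LUT) else 0.0
    return max(a, b) + correction


exact = math.log(math.exp(3) + math.exp(1))
approx = max_star(3, 1)
```

For integer inputs whose difference falls inside the table, the table-based result matches ln(e^a + e^b). For large differences the correction is dropped and the result is simply the larger input.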

[0059] The last two modes are the VL Encoding Mode and VL Decoding Mode.
These modes use a small portion of the butterfly operations but use the rest of the RBN
functions with the exception of the shifters. Like the Viterbi and SISO Modes, these modes
include specialized control functionality that is not available to the other modes.

[0060] The EU control unit 14 is used to control the operations of the EU 12. FIG.
8A is a simplified block diagram illustrating an exemplary embodiment of the EU control
unit 14 in accordance with the present invention. The primary operation of the RBN 10 is to
step through overlapping events. An "event" is defined as a configuration.
Configurations range from setting a mode of a single data path operator to grouping several
data path operators together to perform a single operation. The EU control unit 14 sets up a
timed sequence of a series of overlapping and potentially repeating events. FIG. 8B is a
simplified block diagram illustrating an alternative exemplary embodiment of the EU control
unit 14 in accordance with the present invention.

[0061] The control bits and state bits define the events, their times of occurrence (i.e.,
setup time and teardown time of each given event) and pattern (e.g., every two clock cycles)
within a task.

[0062] A nodal sequencer is a simple instruction based processor. The nodal
sequencer executes code from an instruction memory. The nodal sequencer is responsible for
all task switching and TPL (Task Parameters List) processing. Along with intertask
communications, task setup and tear down, the nodal sequencer provides, if necessary, data
dependent branching functions. The nodal sequencer is external to the data path but has read
and write capability for all data path registers.

[0063] Alternatively, programmable pattern generators are capable of creating one-bit
patterns with a variety of duty cycles. The pattern generators step through whatever pattern
the nodal sequencer provides to the flip-flops during configuration. FIG. 9 is a simplified
block diagram of an exemplary embodiment of a programmable pattern generator in
accordance with the present invention.
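
The behavior of a programmable pattern generator, as described above, can be modeled as a circular sequence of configured bits. This is a minimal sketch and not the circuit of FIG. 9: the function name and the list-based pattern are assumptions for illustration.

```python
from itertools import islice


def pattern_generator(pattern_bits):
    """Cycle endlessly through a configured one-bit pattern, one bit per clock."""
    if not pattern_bits:
        raise ValueError("pattern must contain at least one bit")
    while True:
        for bit in pattern_bits:
            yield bit & 1


# A 50% duty cycle and a 25% duty cycle, observed for a few clock cycles.
every_other = list(islice(pattern_generator([1, 0]), 6))
one_in_four = list(islice(pattern_generator([1, 0, 0, 0]), 8))
```

Changing the configured bit list changes the duty cycle without changing the generator itself.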

[0064] There are three main types of control bits: fixed, counter and state control bits.
Although the RBN 10 does change dynamically during operation, many configurations are
static, i.e., they do not need to change during a task. Control of the static configuration of a
task is implemented as fixed bits. Fixed bits are set before a task and control configurations
like multiplexer selections (i.e., interconnects) and ALU modes (e.g., sign-extension,
arithmetic/logic operation). The fixed bits are set by the nodal sequencer. The settings for
the fixed bits may come from either stored microcode data or from a TPL.

[0065] A second type of control for the RBN 10 is provided by counters. Counters
are used primarily for addressing the RAMs. The counters implement a variety of addressing
modes and can be powerful when combined with the RBN state bits.

[0066] State bits provide dynamic control of the data path. The state bits are
generated from the state table(s). FIG. 10A illustrates how state bits are generated. The state
table(s) is accessed by the state counters. FIG. 10B illustrates how the state table(s) is
accessed by the state counters. The state table output is multiplexed using the fixed state
mapping to form the state bits. The state bits are capable of changing on any clock cycle and
generating any desired pattern. This is accomplished by state counters which sequence
through the predefined state table. There are four (4) state counters in this implementation
which allows a maximum loop depth of four (4). This means that one loop can nest inside a
second loop etc. The nesting and looping capability of the state counters allows great
flexibility and programmability of the state bit patterns.
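
The nested looping through a state table described above can be illustrated with a small software model. This is a sketch under stated assumptions and does not reproduce FIGS. 10A and 10B: the loop-program representation, the function name run_loops, and the example table contents are all hypothetical. Only the limit of four nesting levels comes from the text.

```python
def run_loops(body, depth=1, max_depth=4):
    """Yield state-table row indices for a nested loop program.

    `body` is a list whose items are either a row index (int) or a
    (repeat_count, sub_body) tuple describing an inner loop.
    """
    for item in body:
        if isinstance(item, int):
            yield item
        else:
            if depth > max_depth:
                raise ValueError("only four state counters are available")
            count, sub_body = item
            for _ in range(count):
                yield from run_loops(sub_body, depth + 1, max_depth)


# Hypothetical state table: each row is one word of state bits.
state_table = [0b0001, 0b0010, 0b0100]

# Row 0 once, then twice: row 1 followed by row 2 repeated three times.
program = [0, (2, [1, (3, [2])])]
rows = list(run_loops(program))
state_bits = [state_table[r] for r in rows]
```

Each loop level stands in for one state counter, so a program nested more than four levels deep is rejected.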

[0067] State bits also control the dynamic configurations of the RBN 10. They are
the most complicated of the RBN control options since they are capable of changing on any,
or every, clock cycle. State bits control parts of the EU 12 which are variable during the
execution of one task, like register enables and multiplexer selections. Some state bits toggle
only a few times for a task but are critical in the sequencing. Other state bits toggle as much
as every clock cycle. Several mechanisms in the EU control unit 14 are dedicated to ensure
the correct toggling of the state bits. State bits can be sourced from events, programmable
pattern generators, counter TCs (Terminal Counts), or counter bits. During configuration, the
nodal sequencer sets which source feeds each state bit.

[0068] During task execution, the nodal sequencer sets triggers at specific and usually
pre-specified times. When a trigger occurs, a specific event bit (or event bits) is toggled to
mark the setup or teardown time of a specific event. Once events are active, they can cause
the state bits to toggle, the programmable pattern generators to start or stop, or the counters to
count, stop counting or change direction.

[0069] FIG. 11 is a simplified block diagram illustrating a data path of an exemplary
embodiment of the EU control unit 14 in accordance with the present invention. In addition
to the fixed and state bits, there are some addresses and some word length controls needed by
the EU 12. The word length control is called R-Control. Referring to FIG. 11, the R-Control
destinations include: shifter1, shifter2, reorder RAM read, reorder RAM write, combiner
Control A and combiner Control D. Also shown are the R-Control sources. The LUT RAM
20 performs double duty since it is sometimes part of the data path and sometimes part of the
R-Control. The LUT RAM 20 is addressed by 8-bit counters when used as part of the
R-Control. The X-memory source is part of the nodal memory. When the X-memory is used as
a source, the L1 and L2 sources are not available to the data path. The fixed pattern sources
are sometimes called the small patterns and are simply sixteen 16-bit control words. FIG. 12
illustrates how the fixed pattern is generated. The nodal sequencer sets the control words
during configuration. The smcount1 and smcount2 sources are 16-bit up/down counters
which can be used to address the reorder RAM 22 read and/or write ports. The fixed patterns
and the counters are controlled by state bits.

[0070] As mentioned above, the shifter in the EU 12 is used to implement functions
such as LFSRs, convolutional encoders, scramblers and Galois multiplication. Since the
RBN 10 is used to handle communication and signal processing, it is capable of managing the
channel coding requirements of various wireless standards. Channel coding schemes include
error detecting cyclic codes, error detecting and correcting Hamming codes, single burst error
correcting Fire codes, and so on. Typically, these codes are represented by their generator
polynomials. The degree of polynomials used for the various wireless standards spans a wide
range, from degree 3 for a GSM CRC, to degree 42 for the CDMA long code, to effective
degrees of 64 and 128 for the GSM and Bluetooth ciphers, respectively. Much longer codes
exist in W-CDMA. Encoders and decoders for these kinds of codes utilize LFSRs to multiply
and divide code polynomials. Because of the large number of different codes used by the
various wireless standards, it is impractical, in the RBN 10, to use separate LFSRs for each
encoder and decoder. Under the present invention, a programmable computational element is
implemented to perform these operations.

[0071] LFSRs are combinations of shift register stages and mod-2 adders. Inherently,
these are bit-oriented structures. In one exemplary embodiment, the shifter used in the RBN
10 is one (1) byte or eight (8) bits in width. However, it should be understood that the shifter
can be implemented with any number of bits. An illustrative example will be presented to
demonstrate how the byte-oriented computational element can implement these kinds of
bit-oriented structures.

[0072] An LFSR for the generator polynomial used for the GSM (224, 184) Fire code
is shown in FIG. 13. Each square with a number is a flip-flop and the modulo-2 adders
(exclusive-or gates) are the circles with plus signs.

[0073] In the GSM (224, 184), a block of 184 information bits is protected by 40
extra parity bits used for error detection and correction. These bits are appended to the 184
bits to form a 224 bit sequence. The encoding of the cyclic code is performed in a systematic
form, which means that, in the GSM (224, 184), the polynomial:

d(0)x^{223} + d(1)x^{222} + ... + d(183)x^{40} + p(0)x^{39} + ... + p(38)x + p(39)

where {d(0),d(1),...,d(183)} are the information bits and {p(0),p(1),...,p(39)} are the parity
bits, when divided by g(x), the generator polynomial, yields a remainder equal to:

1 + x + x^{2} + ... + x^{39}.

The block diagram for the encoder is shown in FIG. 14.

[0074] For 184 clock periods, with control signal info/not_par = 1, the information
bits concurrently are shifted into the LFSR and out of the encoder. Then, for 40 clock
periods, with control signal info/not_par = 0, the parity bits are shifted out of the LFSR.

[0075] The bit-serial implementation is straightforward. With d(k) representing the 
information bits, and with r(i) representing the 40-bit LFSR: 

for k = 0 to 183 

    r(i) := r(39) ⊕ d(k) for i = 0; 

    r(i) := r(i-1) for i = 1, 2, 4...16, 18...22, 24, 25, 27...39; and 

    r(i) := r(i-1) ⊕ r(39) ⊕ d(k) for i = 3, 17, 23 and 26 
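
The bit-serial update above can be sketched in Python as follows. This is a minimal illustration and not the hardware of FIG. 14: the function name fire_parity_bit_serial is hypothetical, and the register is assumed to start cleared, which this description does not state.

```python
# Stages that receive r(39) XOR d(k) in addition to the shifted-in neighbour.
FEEDBACK_TAPS = (3, 17, 23, 26)


def fire_parity_bit_serial(info_bits):
    """Shift the information bits d(0)..d(183) through the 40-stage LFSR.

    Returns the parity bits in the order they leave the register,
    that is p(0) = r(39) first and p(39) = r(0) last.
    """
    r = [0] * 40  # assumption: all stages start at zero
    for d in info_bits:
        fb = r[39] ^ d            # feedback bit r(39) XOR d(k)
        nxt = [fb] + r[:39]       # r(0) := fb, r(i) := r(i-1)
        for i in FEEDBACK_TAPS:
            nxt[i] ^= fb          # r(i) := r(i-1) XOR r(39) XOR d(k)
        r = nxt
    return r[::-1]
```

With a single trailing one as input, the register ends with ones in stages 0, 3, 17, 23 and 26, which is x^40 reduced by the generator polynomial implied by these taps.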

[0076] Mapping this encoder onto the byte-oriented LFSR element requires 
processing eight information bits at one time and computing the LFSR state after eight 
consecutive shifts. 

[0077] In the case of an N-bit parallel implementation, it is necessary to process N 
information bits at one time and compute the LFSR state after N consecutive shifts. With 
d(0), d(1), ..., d(7) representing the information byte, one can see by inspection from FIG. 14 
that the feedback byte, b(0), b(1), ..., b(7), will be: 

b(k) = d(k) ⊕ r(39-k) for k = 0 to 7 

In the general N-bit case, the input data is d(0), d(1), ..., d(N-1) and the feedback data, 
b(0), b(1), ..., b(N-1), will be: 

b(k) = d(k) ⊕ r(39-k) for k = 0 to N-1 

The new LFSR state can be generated by the bit-wise modulo-2 addition of the lower (39 - 
(N-1)) bits of the LFSR, or 32 bits in this example, and, in accordance with the feedback taps, 
five copies of the feedback data. The 8-bit version is illustrated in Table 1 below. 

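Table 1: Update Table for LFSR State after Eight Consecutive Shifts 

Register Stage 

39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 

Inputs to Bit-wise Modulo-2 Addition Process (".." marks a stage that receives no input from that row) 

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 .. .. .. .. .. .. .. .. 

.. .. .. .. .. .. b0 b1 b2 b3 b4 b5 b6 b7 .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. 
.. .. .. .. .. .. .. .. .. b0 b1 b2 b3 b4 b5 b6 b7 .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. 
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. b0 b1 b2 b3 b4 b5 b6 b7 .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. 
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. b0 b1 b2 b3 b4 b5 b6 b7 .. .. .. 
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. b0 b1 b2 b3 b4 b5 b6 b7 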

This table simply indicates the bit-wise modulo-2 additions that must be performed to update 
the LFSR after eight consecutive shifts. It is read vertically. For example, the new state for 
register stage 28 will be the modulo-2 addition of feedback bit b2, feedback bit b5 and the 
current value of register stage 20; the new state for register stage 23 will be the modulo-2 
addition of feedback bit b1, feedback bit b7 and the current value of register stage 15; and so 
on. From the table, it can be seen that if a bit-wise modulo-2 addition occurs with bit r(m) 
and b0, then bit r(m-1) will be gated with b1 and bit r(m-i) will be gated with b(i), assuming i 
< N. Vector u is computed as the modulo-2 sum of the vector b bits which are needed to update a 
given segment of the r vector. So: 

r(k) := r(k-N) ⊕ u(k-N) 

In this example, u(20) = b(2) ⊕ b(5) and r(28) := r(20) ⊕ u(20). Restating an earlier point but 
in terms of u, if bit b0 is part of the computation for u(m), then bit b1 will be part of u(m-1) 
and bit b(i) will be part of u(m-i), assuming i < N. 

[0078] In this example it is worth noting that the new states for twelve register stages 
are simply the current states of the respective register stages with an index offset of eight. 
That is, r(i) := r(i-8) for i = 11...16, 34...39. 
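
The eight-shift update described above can be sketched in Python and compared against eight single shifts. This is a minimal sketch under stated assumptions: the function names shift_once and shift_byte are hypothetical, the state is held as a list indexed by register stage, and the update bits are indexed here by the new stage rather than by k-N as in the text.

```python
# Stages, other than stage 0, that XOR in the feedback bit on every shift.
FEEDBACK_TAPS = (3, 17, 23, 26)


def shift_once(r, d):
    """One bit-serial shift of the 40-stage LFSR with input bit d."""
    fb = r[39] ^ d
    nxt = [fb] + r[:39]
    for i in FEEDBACK_TAPS:
        nxt[i] ^= fb
    return nxt


def shift_byte(r, d):
    """Eight shifts in one step; d holds the information bits d(0)..d(7)."""
    # Feedback byte: b(k) = d(k) XOR r(39-k).
    b = [d[k] ^ r[39 - k] for k in range(8)]
    # Update bits, one per new stage: b(k) enters at a tap and is then
    # shifted 7-k more times, so it lands in stage tap + 7 - k.
    u = [0] * 40
    for tap in (0,) + FEEDBACK_TAPS:
        for k in range(8):
            u[tap + 7 - k] ^= b[k]
    # Old state moved up by eight stages; stages 0..7 have no old input.
    old = [0] * 8 + r[:32]
    return [o ^ x for o, x in zip(old, u)]
```

For any starting state and any information byte, shift_byte returns the same state as eight successive calls to shift_once, and stages 11 to 16 and 34 to 39 receive the old state unchanged apart from the index offset of eight.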

[0079] The byte-oriented implementation, too, is relatively straightforward. FIG. 15 
shows the parallel hardware implementation. The 8-bit input data (d) arrives 32 bits at a time 
from the network memory. Since most of the RBN 10 uses 16-bit data paths, the input data is 
unpacked into 8-bit words with 8 zeros added as the high bits. This data is then fed to an 
ALU configured to perform an XOR. The LFSR state is read in from the local memory. It is 
possible to store the LFSR in a local register file instead of the local memory but that is not 
shown here. After unpacking, the high byte of the LFSR data (r(39:32) in our case) is fed to 
the ALU to be XORed with d(7:0). The output of the ALU is the b(7:0) byte with 8 zeros 
added as the high bits. The b byte is not changed until the entire LFSR has been updated. 
Each clock, 8 new bits of the LFSR (r) are clocked from the unpacker and into a pipeline 
register. Also, on each clock, 8 new bits of the update vector (u) are computed by the shifter 
using the b byte. The expander simply expands the u byte to 16 bits which are then XORed 
in the second ALU with the r byte to form an updated byte of the LFSR. The combiner forms 
32-bit words for storage in the local memory. The 40-bit example shown here runs better out 
of a register file but if the LFSR bits (r) are transferred to and from local memory as 8 bits 
plus 24 zeros then the local memory version runs efficiently too. 

[0080] The shifter simply changes the b byte into the update byte (u). Every clock 
cycle, a new set of control bits, c(14:0), arrives to convert b bytes into u bytes. Table 2 shows 
each output bit and the input used to compute it. It also shows the b bits and then the control 
bits necessary to compute the update byte u. From Table 2, it can be seen that for the first 
byte (byte 0) the only control bit set is C13. For the second byte, C14, C8 and C5 are set. 
For the third byte, C6 and C0 are set. For the fourth byte, C12 is set. For the last byte (byte 
4), C7 and C4 are set. 

Table 2: Update bits for GSM (224,184) 

Output     Input      Update bits   Byte,Bit   Control Bits 
(new r)    (old r)    (u) 

R(39)      R(31)      -             0,0        C13(-) 
R(38)      R(30)      -             0,1        C13(-) 
R(37)      R(29)      -             0,2        C13(-) 
R(36)      R(28)      -             0,3        C13(-) 
R(35)      R(27)      -             0,4        C13(-) 
R(34)      R(26)      -             0,5        C13(-) 
R(33)      R(25)      B0            0,6        C13(B0) 
R(32)      R(24)      B1            0,7        C13(B1) 
R(31)      R(23)      B2            1,0        C5(B2), C8(-), C14(-) 
R(30)      R(22)      B0,B3         1,1        C5(B3), C8(B0), C14(-) 
R(29)      R(21)      B1,B4         1,2        C5(B4), C8(B1), C14(-) 
R(28)      R(20)      B2,B5         1,3        C5(B5), C8(B2), C14(-) 
R(27)      R(19)      B3,B6         1,4        C5(B6), C8(B3), C14(-) 
R(26)      R(18)      B4,B7         1,5        C5(B7), C8(B4), C14(-) 
R(25)      R(17)      B5            1,6        C5(-), C8(B5), C14(-) 
R(24)      R(16)      B0,B6         1,7        C5(-), C8(B6), C14(B0) 
R(23)      R(15)      B1,B7         2,0        C0(B7), C6(B1) 
R(22)      R(14)      B2            2,1        C0(-), C6(B2) 
R(21)      R(13)      B3            2,2        C0(-), C6(B3) 
R(20)      R(12)      B4            2,3        C0(-), C6(B4) 
R(19)      R(11)      B5            2,4        C0(-), C6(B5) 
R(18)      R(10)      B6            2,5        C0(-), C6(B6) 
R(17)      R(9)       B7            2,6        C0(-), C6(B7) 
R(16)      R(8)       -             2,7        C0(-), C6(-) 
R(15)      R(7)       -             3,0        C12(-) 
R(14)      R(6)       -             3,1        C12(-) 
R(13)      R(5)       -             3,2        C12(-) 
R(12)      R(4)       -             3,3        C12(-) 
R(11)      R(3)       -             3,4        C12(-) 
R(10)      R(2)       B0            3,5        C12(B0) 
R(9)       R(1)       B1            3,6        C12(B1) 
R(8)       R(0)       B2            3,7        C12(B2) 
R(7)       -          B0,B3         4,0        C4(B3), C7(B0) 
R(6)       -          B1,B4         4,1        C4(B4), C7(B1) 
R(5)       -          B2,B5         4,2        C4(B5), C7(B2) 
R(4)       -          B3,B6         4,3        C4(B6), C7(B3) 
R(3)       -          B4,B7         4,4        C4(B7), C7(B4) 
R(2)       -          B5            4,5        C4(-), C7(B5) 
R(1)       -          B6            4,6        C4(-), C7(B6) 
R(0)       -          B7            4,7        C4(-), C7(B7) 
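
The relationship between the control bits and the update byte in Table 2 can be modelled with the short Python sketch below. The model is inferred from the entries of Table 2 and is an assumption, not a description of the circuit of FIG. 16: control bit c(7) is taken to pass b(i) straight to output bit i, c(7+s) to move b(0) to output bit s, and c(7-s) to move b(s) to output bit 0, with all selected copies combined by modulo-2 addition. The names shifter and GSM_FIRE_CONTROLS are hypothetical.

```python
def shifter(b, controls):
    """Model of the byte-wide shifter.

    b        : list of eight values b(0)..b(7) (bits, or bit masks for tracing)
    controls : indices (0..14) of the control bits that are set
    Returns the update byte u as a list u(0)..u(7), numbered as in the
    "Byte,Bit" column of Table 2.
    """
    u = [0] * 8
    for c in controls:
        offset = c - 7                 # assumed meaning of control bit c
        for i in range(8):
            j = i + offset             # output position reached by b(i)
            if 0 <= j < 8:
                u[j] ^= b[i]           # modulo-2 combination of all copies
    return u


# Control bits per LFSR byte (byte 0 = stages 39..32), as listed in Table 2.
GSM_FIRE_CONTROLS = [{13}, {5, 8, 14}, {0, 6}, {12}, {4, 7}]
```

Passing b = [1, 2, 4, 8, 16, 32, 64, 128] makes each output a mask of the feedback bits that contribute to it. For example, byte 1, bit 1 then evaluates to 1 | 8, that is B0 and B3, matching the row for R(30).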



[0081] The special purpose shifter is shown in FIG. 16. The shifter operates on an 
input byte b[7:0] in a single cycle based on the control bits c[14:0] to compute the output 
i[7:0]. The control bus c is sourced from the 16-bit R-control, with the sixteenth bit, c[15], being 
used in the expander as will be explained below. There are two (2) shifter expanders in the 
RBN EU which compute the 16-bit output buses S1 and S2. 

[0082] The standard operation of the expander, shown in FIG. 17, is to accept the 
eight (8) bit output of the associated shifter (In[7:0]) and, by using state bits (enable hi (enhi), 
enable lo (enlo), clear hi (clrhi), clear lo (clrlo)) and the MSB of the R-control (c[15] or 
masken), compute the outputs of two 8-bit registers which are merged to form the 16-bit output 
bus S1 or S2. 

[0083] The first expander (S1) performs three (3) functions beyond its standard 
operation. These operations are enabled by the state bit (concat). When concat is lo, normal 
mode is used; when concat is hi, the two (2) shifter control bits (shftctl[1:0]) (bits [10:9] of 
location 26) are used to determine which of the three (3) functions is performed. The first 
function (shiftctl = 00 or 10) is a simple concatenation of the eight (8) LS bits from shifter 2 
(S2) to become the MS (most significant) eight (8) bits of shifter 1 (S1). The second function 
(shiftctl = 01) is a concatenation of the seven (7) LS (least significant) bits from shifter 2 (S2) 
to become the MS (most significant) seven (7) bits of shifter 1 (S1), while the eight (8) LS 
(least significant) bits of shifter 1 are concatenated with an LSB (least significant bit) of zero 
(0) to form the nine (9) LSBs of S1. The third function (shiftctl = 11) is a function of the 
eight (8) LS bits from shifter 2 (S2) which are sign extended by one (1) bit to become the MS 
nine (9) bits of shifter 1 (S1), while the MS seven (7) bits of the eight (8) LS bits of shifter 1 
([7:1]) form the seven (7) LSBs of S1. 
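
The three concatenation functions can be written out as bit operations, as in the Python sketch below. This is an illustrative reading of the paragraph above and not the logic of FIG. 17: the function name s1_concat and its argument names are hypothetical, and for the first function the lower eight bits of S1 are assumed to be the byte from shifter 1, which the text does not state.

```python
def s1_concat(shifter1: int, shifter2: int, shiftctl: int) -> int:
    """Form the 16-bit S1 bus when the concat state bit is hi."""
    lo = shifter1 & 0xFF   # eight LS bits of shifter 1
    hi = shifter2 & 0xFF   # eight LS bits of shifter 2
    if shiftctl in (0b00, 0b10):
        # First function: shifter 2 byte becomes the MS eight bits.
        # Assumption: the LS eight bits are the shifter 1 byte.
        return (hi << 8) | lo
    if shiftctl == 0b01:
        # Second function: seven LS bits of shifter 2 on top, then the
        # shifter 1 byte followed by a zero LSB (nine bits in all).
        return ((hi & 0x7F) << 9) | (lo << 1)
    if shiftctl == 0b11:
        # Third function: shifter 2 byte sign extended to nine bits on top,
        # then bits [7:1] of the shifter 1 byte as the seven LSBs.
        sign = (hi >> 7) & 1
        return (((sign << 8) | hi) << 7) | (lo >> 1)
    raise ValueError("shiftctl must be a 2-bit value")
```

In every mode the result fits in 16 bits: 8 + 8, 7 + 9 and 9 + 7 bits respectively.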

[0084] Some applications (most notably W-CDMA) have requirements for a more 
complicated LFSR function. These maskable LFSRs apply a programmable mask register 
(m) to the LFSR state (r). The resulting bits are XORed to produce a single bit per clock. 
FIG. 18 illustrates this for a sample maskable LFSR. 

[0085] Table 3 indicates the state of this LFSR after each of eight (8) consecutive 
clocks. From Table 3, it can be seen that if mask bit 17 (m17) is set then on clock 1, r16 will 
be part of the first bit of the output. It follows that on clock 2, r15 will be part of the second 
bit of the output, and that on clock 3, r14 will be part of the third bit of the output, etc. This 
maps well to the programmable shifter until the eighth bit. For the eighth bit, the input bit r9 
is combined with b0. Bit r9 is not available in the first 8 bits r(17:10) and bit b0 is in a 
separate word altogether. In fact, there are four input words in all: r(17:10), r(9:2), 
r(1:0) and b(7:0). This means that the solution will require four shifter passes per eight bits 
of the output along with the three shifter passes (one for each eight bits of the LFSR (r)) for 
the 18-bit LFSR update. Table 4 shows the control necessary to provide the LFSR update. 
Table 5 shows the desired bits for the output given the setting of any bit of the mask register 
m. More than one mask bit is likely to be set at a time so more than one row of this table will 
be active at a time. Table 6 shows the control bits which need to be set to achieve the output 
specified in Table 5. As in Table 5, multiple rows of Table 6 are likely to be active at one 
time. 
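
The behaviour described for mask bit 17 can be reproduced with a bit-serial Python sketch of the sample maskable LFSR. The structure used here is an assumption inferred from the update bits in Table 4, since FIG. 18 is not reproduced in this text: the bit leaving stage 17 is fed back into stage 0 and XORed into stages 5 and 10, and each output bit is formed from the state after that clock's shift. The names clock and masked_output are hypothetical.

```python
# Assumed feedback positions besides stage 0, inferred from Table 4.
FEEDBACK_TAPS = (5, 10)


def clock(r):
    """One shift of the 18-stage LFSR; r[i] is the bit in stage i."""
    fb = r[17]
    nxt = [fb] + r[:17]
    for t in FEEDBACK_TAPS:
        nxt[t] ^= fb
    return nxt


def masked_output(r, m, n=8):
    """Clock the LFSR n times and return (output bits, final state).

    Each output bit is the modulo-2 sum of the state bits selected by mask m.
    """
    out = []
    for _ in range(n):
        r = clock(r)
        bit = 0
        for r_i, m_i in zip(r, m):
            bit ^= r_i & m_i
        out.append(bit)
    return out, r
```

With only m(17) set, the eight output bits are r16, r15, ..., r10 followed by r9 XOR b0, where b0 is the original r17. This agrees with the m(17) row of Table 5.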

[0086] Since the LFSR functionality and the shifter implementation are easily 
parallelized, an 18-bit LFSR with the maskable output could be implemented in several ways 
depending on the desired performance. For example, if it is desired to run at top speed, then 
seven shifters can implement the function in one clock cycle. If one shifter is used, it will 
require seven clock cycles to implement the function. 



Table 3: Eight Consecutive Shifts for Maskable Output 

(".." marks an empty position) 

Mask Register:  17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 

Time 1:         16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 .. 
                .. .. .. .. .. .. .. b0 .. .. .. .. b0 .. .. .. .. b0 

Time 2:         15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 .. .. 
                .. .. .. .. .. .. b0 b1 .. .. .. b0 b1 .. .. .. b0 b1 

Time 3:         14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 .. .. .. 
                .. .. .. .. .. b0 b1 b2 .. .. b0 b1 b2 .. .. b0 b1 b2 

Time 4:         13 12 11 10 09 08 07 06 05 04 03 02 01 00 .. .. .. .. 
                .. .. .. .. b0 b1 b2 b3 .. b0 b1 b2 b3 .. b0 b1 b2 b3 

Time 5:         12 11 10 09 08 07 06 05 04 03 02 01 00 .. .. .. .. .. 
                .. .. .. b0 b1 b2 b3 b4 b0 b1 b2 b3 b4 b0 b1 b2 b3 b4 

Time 6:         11 10 09 08 07 06 05 04 03 02 01 00 .. .. .. .. .. .. 
                .. .. b0 b1 b2 b3 b4 b5 .. .. .. .. b0 b1 b2 b3 b4 b5 
                .. .. .. .. .. .. .. b0 b1 b2 b3 b4 b5 .. .. .. .. .. 

Time 7:         10 09 08 07 06 05 04 03 02 01 00 .. .. .. .. .. .. .. 
                .. b0 b1 b2 b3 b4 b5 b6 .. .. .. b0 b1 b2 b3 b4 b5 b6 
                .. .. .. .. .. .. b0 b1 b2 b3 b4 b5 b6 .. .. .. .. .. 

Time 8:         09 08 07 06 05 04 03 02 01 00 .. .. .. .. .. .. .. .. 
                b0 b1 b2 b3 b4 b5 b6 b7 .. .. b0 b1 b2 b3 b4 b5 b6 b7 
                .. .. .. .. .. b0 b1 b2 b3 b4 b5 b6 b7 .. .. .. .. .. 



Table 4: Update bits for Maskable LFSR 

Output     Input      Update bits   Byte,Bit   Control Bits 
(new r)    (old r)    (u) 

R(17)      R(9)       B0            0,0        C7(B0), C12(-) 
R(16)      R(8)       B1            0,1        C7(B1), C12(-) 
R(15)      R(7)       B2            0,2        C7(B2), C12(-) 
R(14)      R(6)       B3            0,3        C7(B3), C12(-) 
R(13)      R(5)       B4            0,4        C7(B4), C12(-) 
R(12)      R(4)       B5,B0         0,5        C7(B5), C12(B0) 
R(11)      R(3)       B6,B1         0,6        C7(B6), C12(B1) 
R(10)      R(2)       B7,B2         0,7        C7(B7), C12(B2) 
R(9)       R(1)       B3            1,0        C4(B3), C9(-) 
R(8)       R(0)       B4            1,1        C4(B4), C9(-) 
R(7)       -          B5,B0         1,2        C4(B5), C9(B0) 
R(6)       -          B6,B1         1,3        C4(B6), C9(B1) 
R(5)       -          B7,B2         1,4        C4(B7), C9(B2) 
R(4)       -          B3            1,5        C4(-), C9(B3) 
R(3)       -          B4            1,6        C4(-), C9(B4) 
R(2)       -          B5            1,7        C4(-), C9(B5) 
R(1)       -          B6            2,0        C1(B6) 
R(0)       -          B7            2,1        C1(B7) 


Table 5: Output bits of the Maskable LFSR 

Mask      Shifter 1:        Shifter 2:       Shifter 3:        Shifter 4: 
Bit (m)   R(17:10)          R(9:2)           R(1:0),000000     b(0:7) 

m(17)     R(16:10),0        0000000,R(9)     00000000          0000000,B(0) 
m(16)     R(15:10),00       000000,R(9:8)    00000000          000000,B(0:1) 
m(15)     R(14:10),000      00000,R(9:7)     00000000          00000,B(0:2) 
m(14)     R(13:10),0000     0000,R(9:6)      00000000          0000,B(0:3) 
m(13)     R(12:10),00000    000,R(9:5)       00000000          000,B(0:4) 
m(12)     R(11:10),000000   00,R(9:4)        00000000          00,B(0:5) ⊕ 0000000,B(0) 
m(11)     R(10),0000000     0,R(9:3)         00000000          0,B(0:6) ⊕ 000000,B(0:1) 
m(10)     00000000          R(9:2)           00000000          B(0:7) ⊕ 00000,B(0:2) 
m(9)      00000000          R(8:2),0         0000000,R(1)      0000,B(0:3) 
m(8)      00000000          R(7:2),00        000000,R(1:0)     000,B(0:4) 
m(7)      00000000          R(6:2),000       00000,R(1:0),0    00,B(0:5) ⊕ 0000000,B(0) 
m(6)      00000000          R(5:2),0000      0000,R(1:0),00    0,B(0:6) ⊕ 000000,B(0:1) 
m(5)      00000000          R(4:2),00000     000,R(1:0),000    B(0:7) ⊕ 00000,B(0:2) 
m(4)      00000000          R(3:2),000000    00,R(1:0),0000    0000,B(0:3) 
m(3)      00000000          R(2),0000000     0,R(1:0),00000    000,B(0:4) 
m(2)      00000000          00000000         R(1:0),000000     00,B(0:5) 
m(1)      00000000          00000000         R(0),0000000      0,B(0:6) 
m(0)      00000000          00000000         00000000          B(0:7) 



Table 6: Control bits for the Maskable LFSR 

Mask      Shifter 1:   Shifter 2:   Shifter 3:        Shifter 4:     Byte,Bit 
Bit (m)   R(17:10)     R(9:2)       R(1:0),000000     b(0:7) 

m(17)     C(6)         C(14)        -                 C(14)          0,0 
m(16)     C(5)         C(13)        -                 C(13)          0,1 
m(15)     C(4)         C(12)        -                 C(12)          0,2 
m(14)     C(3)         C(11)        -                 C(11)          0,3 
m(13)     C(2)         C(10)        -                 C(10)          0,4 
m(12)     C(1)         C(9)         -                 C(9), C(14)    0,5 
m(11)     C(0)         C(8)         -                 C(8), C(13)    0,6 
m(10)     -            C(7)         -                 C(7), C(12)    0,7 
m(9)      -            C(6)         C(14)             C(11)          1,0 
m(8)      -            C(5)         C(13)             C(10)          1,1 
m(7)      -            C(4)         C(12)             C(9), C(14)    1,2 
m(6)      -            C(3)         C(11)             C(8), C(13)    1,3 
m(5)      -            C(2)         C(10)             C(7), C(12)    1,4 
m(4)      -            C(1)         C(9)              C(11)          1,5 
m(3)      -            C(0)         C(8)              C(10)          1,6 
m(2)      -            -            C(7)              C(9)           1,7 
m(1)      -            -            C(6)              C(8)           2,0 
m(0)      -            -            -                 C(7)           2,1 



[0087] In one exemplary embodiment, the shifter is part of a set of elements that can 
be programmed to implement any LFSR. The byte-oriented version of the shifter can be used 
in the RBN 10. This version will perform eight bits of the LFSR per clock cycle which, 
obviously, is the maximum rate possible for a byte-oriented version. The byte-oriented 
version requires seven XOR gates and eight AND gates per bit. Each XOR gate is 
implemented with three gates. So, the total per-bit gate count for the shifter is 8 + (3 x 7) or 8 
+ 21 or 29 gates per bit. This results in 8 x 29 or 232 total gates for the shifter. Fifteen (15) 
control bits are used to operate the shifter. Some of these control bits typically may need to 
be changed each clock cycle. The control bits can be sourced from a command word, a 
control state machine or from a Look Up Table (LUT) RAM. The RAM is 256x16 and is, in 
this case, addressed by a state machine. The 256x16 RAM can be used to provide the control 
for an LFSR of up to 2048 bits. 
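
The gate count and the LUT capacity quoted above follow from simple arithmetic, restated in the Python sketch below. The constant names are hypothetical, and the reading that one 16-bit control word serves one byte-wide shifter pass is an assumption used here to arrive at the 2048-bit figure.

```python
AND_GATES_PER_BIT = 8   # one AND gate per candidate input of an output bit
XOR_GATES_PER_BIT = 7   # seven XORs combine the eight gated inputs
GATES_PER_XOR = 3       # each XOR is built from three gates
SHIFTER_WIDTH = 8       # byte-oriented shifter

# Per-bit and total gate counts for the shifter.
gates_per_bit = AND_GATES_PER_BIT + GATES_PER_XOR * XOR_GATES_PER_BIT
total_gates = SHIFTER_WIDTH * gates_per_bit

LUT_WORDS = 256         # depth of the 256x16 control RAM
# Assumption: one control word drives one byte-wide pass of the shifter.
max_lfsr_bits = LUT_WORDS * SHIFTER_WIDTH
```

Under these figures, gates_per_bit is 29, total_gates is 232 and max_lfsr_bits is 2048.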

[0088] In one exemplary implementation, the present invention is implemented with 
control logic using computer software in either an integrated or modular manner or hardware 
or a combination of both. However, it should be understood that based on the disclosure and 
teachings provided herein, a person of ordinary skill in the art will know of other ways and/or 
methods to implement the present invention. 

[0089] It is understood that the examples and embodiments described herein are for 
illustrative purposes only and that various modifications or changes in light thereof will be 
suggested to persons skilled in the art and are to be included within the spirit and purview of 
this application and scope of the appended claims. Accordingly, the disclosures and 
descriptions herein are intended to be illustrative, but not limiting, of the scope of the 
invention which is set forth in the following claims. 